Add tiered-storage module with stored fields prefetch support by GeekGlider · Pull Request #20962 · opensearch-project/OpenSearch

GeekGlider · 2026-03-23T09:23:44Z

Description

Adds the tiered-storage module to support writable warm index features. This module introduces stored fields prefetching for tiered storage indices, which prefetches stored fields during the fetch phase of search to improve read performance on warm indices.

The module includes:

TieredStoragePlugin — the main plugin class that registers cluster settings and a search operation listener
StoredFieldsPrefetch — a SearchOperationListener that prefetches stored fields when the feature flag is enabled
TieredStoragePrefetchSettings — cluster-level settings to control prefetch behavior (read-ahead block count and enable/disable toggle)
Unit tests for prefetch settings and the stored fields prefetch listener

Check List

Functionality includes testing.
API changes companion pull request created, if applicable.
Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2026-03-23T09:25:18Z

PR Reviewer Guide 🔍

(Review updated until commit `f0959d8`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Add TieredStoragePrefetchSettings cluster settings Relevant files: modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/TieredStoragePrefetchSettings.java modules/tiered-storage/src/test/java/org/opensearch/storage/prefetch/TieredStoragePrefetchSettingsTests.java Sub-PR theme: Add StoredFieldsPrefetch listener and TieredStoragePlugin Relevant files: modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java modules/tiered-storage/src/main/java/org/opensearch/storage/package-info.java modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/package-info.java modules/tiered-storage/src/test/java/org/opensearch/storage/prefetch/StoredFieldsPrefetchTests.java CHANGELOG.md
⚡ Recommended focus areas for review Possible Issue In `executePrefetch`, when the reader changes to a new segment and the inner reader is not a `SegmentReader`, `currentReader` is set to null and the loop continues. However, `currentReaderIndex` is already updated to `readerIndex`, so subsequent docs in the same non-SegmentReader segment will skip the null check at the top of the loop (since `currentReaderIndex == readerIndex`) and proceed with `currentReader == null`, hitting the `continue` correctly. But `currentReaderContext` will also have been updated to the new context. This is actually fine for the null check path, but the `assert currentReaderContext != null` on line 82 could mask the real issue: if `currentReaderContext` is somehow null (e.g., first iteration), the assert is a no-op in production builds. More critically, `readerIndex` is used in the debug log on line 88 but is declared inside the try block scope — this is a scoping issue that may cause a compile error since `readerIndex` is referenced in the log statement outside its declaration scope. int currentReaderIndex = -1; LeafReaderContext currentReaderContext = null; StoredFields currentReader = null; log.debug("Stored Field Execute prefetch was triggered: {}", context.docIdsToLoadSize()); for (int index = 0; index < context.docIdsToLoadSize(); index++) { int docId = context.docIdsToLoad()[context.docIdsToLoadFrom() + index]; try { int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this segment log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); currentReader = null; continue; } currentReader = innerLeafReader.storedFields(); } assert currentReaderContext != null; if (currentReader == null) { continue; } log.debug( "Prefetching stored fields for index shard: {}, docId: {}, readerIndex: {}", context.indexShard().shardId(), docId, readerIndex ); // nested docs logic final int subDocId = docId - currentReaderContext.docBase; final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); } currentReader.prefetch(subDocId); } catch (Exception e) { log.warn("Failed to prefetch stored fields for docId: " + docId, e); } } Incorrect Prefetch Logic In `findRootDocumentIfNested`, when `bits.get(subDocId)` is false (meaning the doc is a nested/child doc), `bits.nextSetBit(subDocId)` returns the next set bit at or after `subDocId`. This could return `subDocId` itself if the bit is set, but the condition already checked it's not set, so it returns the next parent doc. However, `nextSetBit` may return `NO_MORE_DOCS` (Integer.MAX_VALUE) if there is no parent doc after `subDocId`, which would then be passed to `currentReader.prefetch()` causing an invalid document access. There is no upper-bound check on the return value. private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException { if (context.mapperService().hasNested()) { BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext); if (bits != null && !bits.get(subDocId)) { return bits.nextSetBit(subDocId); } } return -1; } Null Dereference Risk `getPrefetchSettingsSupplier()` returns a supplier that accesses `this.tieredStoragePrefetchSettings`, which is only initialized in `createComponents()`. If `onIndexModule()` is called before `createComponents()` (e.g., during plugin initialization ordering), the supplier will return null. `StoredFieldsPrefetch.checkIfStoredFieldsPrefetchEnabled()` does handle a null settings object gracefully, but this silent null return could mask initialization ordering bugs. public Supplier<TieredStoragePrefetchSettings> getPrefetchSettingsSupplier() { return () -> this.tieredStoragePrefetchSettings; } // @Override // public List<ActionHandler<? extends ActionRequest, ? extends ActionResponse>> getActions() { // if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { // return List.of( // new ActionHandler<>(HotToWarmTierAction.INSTANCE, TransportHotToWarmTierAction.class), // new ActionHandler<>(WarmToHotTierAction.INSTANCE, TransportWarmToHotTierAction.class), // new ActionHandler<>(CancelTieringAction.INSTANCE, TransportCancelTierAction.class), // new ActionHandler<>(ListTieringStatusAction.INSTANCE, TransportListTieringStatusAction.class), // new ActionHandler<>(GetTieringStatusAction.INSTANCE, TransportGetTieringStatusAction.class)); // } else { // return List.of(); // } // } // @Override // public List<RestHandler> getRestHandlers( // Settings settings, RestController restController, // ClusterSettings clusterSettings, // IndexScopedSettings indexScopedSettings, SettingsFilter settingsFilter, // IndexNameExpressionResolver indexNameExpressionResolver, // Supplier<DiscoveryNodes> nodesInCluster // ) { // if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { // return List.of( // new RestHotToWarmTierAction(), // new RestWarmToHotTierAction(), // new RestCancelTierAction(), // new RestGetTieringStatusAction(), // new RestListTieringStatusAction()); // } else { // return List.of(); // } // } @Override public Collection<Object> createComponents( Client client, ClusterService clusterService, ThreadPool threadPool, ResourceWatcherService resourceWatcherService, ScriptService scriptService, NamedXContentRegistry xContentRegistry, Environment environment, NodeEnvironment nodeEnvironment, NamedWriteableRegistry namedWriteableRegistry, IndexNameExpressionResolver indexNameExpressionResolver, Supplier<RepositoriesService> repositoriesServiceSupplier // Tracer tracer, // MetricsRegistry metricsRegistry ) { // this.tierActionMetrics = new TierActionMetrics(metricsRegistry); this.tieredStoragePrefetchSettings = new TieredStoragePrefetchSettings(clusterService.getClusterSettings()); // this.threadpool.set(threadPool); return Collections.emptyList(); } @Override public void onIndexModule(IndexModule indexModule) { if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings())); indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier())); } }

github-actions · 2026-03-23T09:25:53Z

PR Code Suggestions ✨

Latest suggestions up to f0959d8

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Handle IOException from storedFields() call `innerLeafReader.storedFields()` can throw an `IOException`, but it is called outside the `try-catch` block that wraps the rest of the per-document logic. This means an `IOException` from `storedFields()` would propagate uncaught and abort the entire prefetch loop. Move the `storedFields()` call inside the `try-catch` block or handle the exception explicitly. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81] int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this segment log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); currentReader = null; continue; } - currentReader = innerLeafReader.storedFields(); + try { + currentReader = innerLeafReader.storedFields(); + } catch (IOException e) { + log.warn("Failed to get storedFields for segment, skipping prefetch", e); + currentReader = null; + continue; + } } Suggestion importance[1-10]: 7 __ Why: The `storedFields()` call at line 80 can throw `IOException` but is outside the `try-catch` block, meaning an exception would abort the entire prefetch loop. This is a real robustness issue that should be addressed.	Medium
General	Avoid double-prefetching nested and root documents When a nested document is found, the root document is prefetched first, but the sub-document is always prefetched regardless. If `subDocId` is itself the root document (i.e., `bits.get(subDocId)` is true in `findRootDocumentIfNested`), this is fine, but if `rootDocId != -1` and `rootDocId == subDocId`, the same document is prefetched twice. More critically, if the sub-document is a nested child, prefetching it separately may be unnecessary or wasteful. Consider only prefetching `subDocId` when it is not a nested child (i.e., `rootDocId == -1`). modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [95-99] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 5 __ Why: The suggestion raises a valid concern about potentially redundant prefetching when a nested document's root is found. However, the original code's intent may be to prefetch both the root and the child document for completeness, and the behavior difference is subtle. The improvement is reasonable but not critical.	Low
General	Guard against uninitialized settings in onIndexModule `onIndexModule` is called once per index, so a new `StoredFieldsPrefetch` instance is created for every index. Since `StoredFieldsPrefetch` is stateless (it only holds a supplier reference), this is not a bug per se, but it is wasteful. More importantly, if `tieredStoragePrefetchSettings` has not yet been initialized (i.e., `createComponents` has not been called), the supplier will return `null`, and `checkIfStoredFieldsPrefetchEnabled` will silently return `false` rather than failing visibly. Consider adding a guard or assertion to ensure `tieredStoragePrefetchSettings` is initialized before `onIndexModule` is invoked. modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [167-172] public void onIndexModule(IndexModule indexModule) { if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { - // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings())); + if (tieredStoragePrefetchSettings == null) { + throw new IllegalStateException("TieredStoragePrefetchSettings not initialized; createComponents must be called before onIndexModule"); + } indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier())); } } Suggestion importance[1-10]: 4 __ Why: While the concern about `tieredStoragePrefetchSettings` being null is valid, the `checkIfStoredFieldsPrefetchEnabled` method already handles the null case gracefully by returning `false`. Throwing an `IllegalStateException` would be a breaking change in behavior and may not be appropriate for a plugin lifecycle scenario.	Low

Previous suggestions

Suggestions up to commit 4e87c8d

Category	Suggestion	Impact
Possible issue	Avoid double-prefetching nested document stored fields When a document is nested, `subDocId` is a child document and `rootDocId` is the parent. Prefetching `subDocId` after `rootDocId` may be redundant or incorrect — nested child documents typically don't have their own stored fields entry separate from the root. You should only prefetch the root document when nesting is detected, and skip prefetching `subDocId` in that case to avoid unnecessary or erroneous I/O. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [95-99] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 7 __ Why: The current code prefetches both `rootDocId` and `subDocId` when nesting is detected, but nested child documents share stored fields with the root document. Prefetching `subDocId` after `rootDocId` is redundant and potentially incorrect. The fix to use `else` is logically sound and avoids unnecessary I/O.	Medium
Possible issue	Handle IOException from storedFields() call safely `innerLeafReader.storedFields()` can throw an `IOException`, but it is called outside the `try/catch` block that wraps the rest of the per-document logic. This means an exception here would propagate uncaught and abort the entire prefetch loop. Move the `storedFields()` call inside the `try/catch` block or handle the exception explicitly. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81] int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this segment log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); currentReader = null; continue; } - currentReader = innerLeafReader.storedFields(); + try { + currentReader = innerLeafReader.storedFields(); + } catch (IOException e) { + log.warn("Failed to get storedFields for segment, skipping prefetch", e); + currentReader = null; + continue; + } } Suggestion importance[1-10]: 6 __ Why: The `innerLeafReader.storedFields()` call at line 80 can throw `IOException` but is inside the outer `try/catch(Exception e)` block that wraps the entire per-document logic (lines 62-102), so it is already handled. However, the suggestion to handle it more specifically with a `continue` is a minor improvement for clarity and resilience.	Low
General	Reuse single listener instance across all index modules `onIndexModule` is called once per index, potentially creating a new `StoredFieldsPrefetch` instance for every index. Since `StoredFieldsPrefetch` is stateless beyond its settings supplier, a single shared instance should be created (e.g., in `createComponents`) and reused across all index modules to avoid unnecessary object allocation. modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [167-172] +// In createComponents, add: +// this.storedFieldsPrefetch = new StoredFieldsPrefetch(getPrefetchSettingsSupplier()); + +@Override public void onIndexModule(IndexModule indexModule) { if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { - // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings())); - indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier())); + indexModule.addSearchOperationListener(storedFieldsPrefetch); } } Suggestion importance[1-10]: 4 __ Why: Creating a new `StoredFieldsPrefetch` per index module is wasteful since the class is stateless beyond its settings supplier. Reusing a single instance is a valid optimization, though the performance impact is minor. The `improved_code` uses a comment placeholder rather than showing the full implementation, making it less actionable.	Low

Suggestions up to commit b00a2bc

Category	Suggestion	Impact
Possible issue	Handle IOException from storedFields() acquisition `innerLeafReader.storedFields()` can throw an `IOException`, but it is called outside the `try-catch` block that wraps the rest of the per-document logic. This means an `IOException` from `storedFields()` would propagate uncaught and abort the entire prefetch loop. Move the `storedFields()` call inside the `try-catch` block or handle the exception explicitly. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81] int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this segment log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); currentReader = null; continue; } - currentReader = innerLeafReader.storedFields(); + try { + currentReader = innerLeafReader.storedFields(); + } catch (IOException e) { + log.warn("Failed to get storedFields for segment, skipping prefetch", e); + currentReader = null; + continue; + } } Suggestion importance[1-10]: 7 __ Why: The `innerLeafReader.storedFields()` call can throw `IOException` but is currently outside the `try-catch` block, meaning an exception would propagate uncaught and abort the entire prefetch loop. Wrapping it in a try-catch ensures robustness and consistent error handling.	Medium
Possible issue	Avoid redundant prefetch for nested child documents When a document is nested, `subDocId` is a child document and `rootDocId` is the parent. Prefetching `subDocId` after `rootDocId` may be redundant or incorrect — nested child documents are typically stored within the root document's stored fields block. You should only prefetch the root document when nesting is detected, and skip prefetching the child `subDocId` separately to avoid unnecessary or duplicate I/O. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [97-101] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 6 __ Why: When a document is nested, prefetching both `rootDocId` and `subDocId` may result in redundant I/O since nested child documents are stored within the root document's block. Using an `else` branch to only prefetch `subDocId` when there's no nesting is a reasonable optimization, though the actual behavior depends on the Lucene storage implementation.	Low
General	Reuse existing supplier method to avoid code duplication The supplier passed to `StoredFieldsPrefetch` can return `null` if `tieredStoragePrefetchSettings` is not yet initialized. However, `getPrefetchSettingsSupplier()` already handles this case with the same null check. Using `getPrefetchSettingsSupplier()` here would be more consistent and avoid duplicating the null-check logic inline. modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [170-175] -indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> { - if (this.tieredStoragePrefetchSettings == null) { - return null; - } - return this.tieredStoragePrefetchSettings; -})); +indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier())); Suggestion importance[1-10]: 5 __ Why: The inline lambda duplicates the null-check logic already present in `getPrefetchSettingsSupplier()`. Using the existing method improves consistency and reduces code duplication, though the impact is minor.	Low

Suggestions up to commit a6eb66a

Category	Suggestion	Impact
Possible issue	Handle IOException from storedFields() acquisition `innerLeafReader.storedFields()` can throw an `IOException`, but it is called outside the `try/catch` block that wraps the rest of the per-document logic. This means an `IOException` from `storedFields()` would propagate uncaught and abort the entire prefetch loop. Move the `storedFields()` call inside the `try/catch` block or handle the exception explicitly. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81] int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this segment log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); currentReader = null; continue; } - currentReader = innerLeafReader.storedFields(); + try { + currentReader = innerLeafReader.storedFields(); + } catch (IOException e) { + log.warn("Failed to get storedFields for segment, skipping prefetch", e); + currentReader = null; + continue; + } } Suggestion importance[1-10]: 7 __ Why: The `innerLeafReader.storedFields()` call can throw `IOException` but is currently outside the `try/catch` block, meaning an exception would abort the entire prefetch loop. Moving it inside the catch block prevents this failure mode and improves robustness.	Medium
General	Avoid redundant prefetch of nested child documents When a document is nested, `subDocId` is a child document and `rootDocId` is the parent. Prefetching `subDocId` when it is a nested (non-root) document may be unnecessary or incorrect, since the stored fields of interest are on the root document. Consider only prefetching `subDocId` when it is not a nested child (i.e., when `rootDocId ==` `-1`). modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [97-101] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 6 __ Why: When a document is nested, prefetching both the root and the child `subDocId` may be redundant or incorrect since stored fields are on the root document. The suggested change to only prefetch `subDocId` when it's not a nested child is a reasonable optimization, though the impact depends on the actual data model.	Low
General	Restrict prefetch listener to tiered-storage indices only The `StoredFieldsPrefetch` listener is registered for every index module when the feature flag is enabled, regardless of whether the index is actually a tiered-storage index. This could cause unnecessary overhead on all indices. Consider adding a check on the index settings or index type before registering the listener. modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [170-175] -indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> { - if (this.tieredStoragePrefetchSettings == null) { - return null; - } - return this.tieredStoragePrefetchSettings; -})); +if (indexModule.getIndexSettings().getSettings().get("index.store.type", "").contains("tiered")) { + indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> { + if (this.tieredStoragePrefetchSettings == null) { + return null; + } + return this.tieredStoragePrefetchSettings; + })); +} Suggestion importance[1-10]: 4 __ Why: The suggestion to restrict the listener to tiered-storage indices is a valid design concern, but the `improved_code` uses a hardcoded string check on `index.store.type` that may not be the correct mechanism in this codebase. The approach is speculative and may not reflect the actual index type detection pattern used in OpenSearch.	Low

Suggestions up to commit 799eb95

Category	Suggestion	Impact
Possible issue	Skip only affected segment on unexpected reader type When an unexpected (non-`SegmentReader`) reader type is encountered, the method returns immediately, skipping all remaining documents across all segments — not just the current segment. This is overly aggressive; the issue may be isolated to one segment. Consider using `continue` to skip only the current segment and attempt prefetching for subsequent segments. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-80] int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves()); if (currentReaderIndex != readerIndex) { currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex); currentReaderIndex = readerIndex; // Unwrap the reader here LeafReader innerLeafReader = currentReaderContext.reader(); while (innerLeafReader instanceof FilterLeafReader) { innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate(); } // never be the case, just sanity check if (!(innerLeafReader instanceof SegmentReader)) { // disable prefetch on stored fields for this case log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName()); - return; + currentReader = null; + continue; } currentReader = innerLeafReader.storedFields(); } +if (currentReader == null) { + continue; +} Suggestion importance[1-10]: 7 __ Why: The current `return` statement aborts prefetching for all remaining documents when a single segment has an unexpected reader type. Using `continue` with a null guard is a more targeted fix that allows other segments to still be prefetched, improving robustness.	Medium
Possible issue	Avoid duplicate prefetch for nested documents When a document is nested, `subDocId` is a child document and `rootDocId` is the parent. Prefetching `subDocId` after `rootDocId` may be redundant or incorrect since nested child documents' stored fields are typically stored with the root document. More critically, if `rootDocId == subDocId` (i.e., the doc is already the root), both prefetch calls will be made unnecessarily. Consider skipping the `subDocId` prefetch when a valid `rootDocId` is found. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [93-97] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 6 __ Why: The suggestion correctly identifies that when a nested root document is found, prefetching `subDocId` (the child) is likely redundant since stored fields are typically stored with the root. Using `else` avoids the unnecessary second prefetch call, improving efficiency.	Low
General	Handle missing root document for nested docs `BitSet.nextSetBit(subDocId)` may return `-1` if no set bit exists at or after `subDocId`, which would be misinterpreted by the caller as "no root document found" — but `-1` is also the sentinel value for "not nested". While unlikely in practice, if `nextSetBit` returns `-1` (no parent found), the caller would skip the prefetch silently. Add a guard to handle this edge case explicitly. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [104-112] private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException { if (context.mapperService().hasNested()) { BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext); if (bits != null && !bits.get(subDocId)) { - return bits.nextSetBit(subDocId); + int rootDocId = bits.nextSetBit(subDocId); + if (rootDocId != -1) { + return rootDocId; + } + log.warn("Could not find root document for nested docId: {}", subDocId); } } return -1; } Suggestion importance[1-10]: 4 __ Why: The suggestion correctly notes that `nextSetBit` can return `-1`, but in practice this is already handled correctly since `-1` is the sentinel value and the caller checks `rootDocId != -1`. The added warning log provides minor diagnostic value but the functional change is minimal.	Low

Suggestions up to commit a3c9bdd

Category	Suggestion	Impact
Possible issue	Add null check before accessing BitSet The `bits` variable returned by `getBitSet` can be `null` if no documents match the filter for the given leaf context. Calling `bits.get(subDocId)` without a null check will throw a `NullPointerException` at runtime. Add a null guard before accessing the `BitSet`. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [104-112] private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException { if (context.mapperService().hasNested()) { BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext); - if (!bits.get(subDocId)) { + if (bits != null && !bits.get(subDocId)) { return bits.nextSetBit(subDocId); } } return -1; } Suggestion importance[1-10]: 8 __ Why: The `getBitSet` method can return `null` when no documents match the filter for a given leaf context, and calling `bits.get(subDocId)` without a null check would cause a `NullPointerException`. This is a real bug that needs to be fixed.	Medium
Possible issue	Guard against null settings before component initialization `getPrefetchSettingsSupplier()` returns a supplier that accesses `this.tieredStoragePrefetchSettings`, which is only initialized in `createComponents()`. If `onIndexModule` is called before `createComponents()`, the supplier will return `null`, causing a `NullPointerException` when `isStoredFieldsPrefetchEnabled()` is invoked. Add a null check inside the supplier or ensure initialization order is guaranteed. modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [166-172] @Override public void onIndexModule(IndexModule indexModule) { if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) { - indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier())); + indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> { + if (this.tieredStoragePrefetchSettings == null) { + return null; + } + return this.tieredStoragePrefetchSettings; + })); } } Suggestion importance[1-10]: 7 __ Why: `tieredStoragePrefetchSettings` is only initialized in `createComponents()`, and if `onIndexModule` is called before that, the supplier returns `null`, causing a `NullPointerException`. The null guard in the improved code addresses this lifecycle issue, though the `StoredFieldsPrefetch` class would also need to handle a `null` settings return.	Medium
General	Handle prefetch errors gracefully without aborting Throwing an exception on the first prefetch failure aborts prefetching for all remaining documents. Since prefetching is an optimization (not correctness-critical), it would be more resilient to log the error and continue processing the remaining documents rather than propagating the exception and potentially failing the entire fetch phase. modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [98-100] } catch (Exception e) { - throw ExceptionsHelper.convertToOpenSearchException(e); + log.warn("Failed to prefetch stored fields for docId: " + docId, e); } Suggestion importance[1-10]: 6 __ Why: Since prefetching is an optimization, throwing an exception on failure aborts the entire fetch phase unnecessarily. Logging and continuing would be more resilient, though this is a design decision that may be intentional.	Low
General	Avoid double-prefetching nested and root documents When a document is a root document (not nested), `findRootDocumentIfNested` returns `-1` and `subDocId` is prefetched correctly. However, when a document IS nested, both `rootDocId` and `subDocId` are prefetched. But `subDocId` itself may be a nested (child) document that doesn't have stored fields — prefetching it could be wasteful or erroneous. Consider only prefetching `subDocId` when it is not a nested child (i.e., when `rootDocId == -1`). modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [93-97] final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId); if (rootDocId != -1) { currentReader.prefetch(rootDocId); +} else { + currentReader.prefetch(subDocId); } -currentReader.prefetch(subDocId); Suggestion importance[1-10]: 5 __ Why: The suggestion raises a valid concern about prefetching nested child documents that may not have stored fields, but the behavior may be intentional. The improved code changes the logic significantly and could affect correctness depending on the use case.	Low

github-actions · 2026-03-23T09:27:21Z

Persistent review updated to latest commit a3c9bdd

github-actions · 2026-03-23T10:03:36Z

Persistent review updated to latest commit 799eb95

github-actions · 2026-03-23T10:21:56Z

Persistent review updated to latest commit a6eb66a

github-actions · 2026-03-23T10:59:58Z

❌ Gradle check result for a6eb66a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2026-03-23T11:26:51Z

Persistent review updated to latest commit b00a2bc

github-actions · 2026-03-23T12:05:23Z

❌ Gradle check result for b00a2bc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2026-03-23T12:14:10Z

Persistent review updated to latest commit 4e87c8d

github-actions · 2026-03-23T12:51:53Z

❌ Gradle check result for 4e87c8d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kavya Aggarwal <kavyaagg@amazon.com>

github-actions · 2026-03-23T13:07:07Z

Persistent review updated to latest commit f0959d8

github-actions · 2026-03-23T13:49:41Z

❌ Gradle check result for f0959d8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

kkewwei · 2026-03-23T14:02:44Z

@GeekGlider Same with PR #20176.

Could you explain why a new module is required to implement this functionality? If the requirement is simply to add prefetching for stored fields, creating a separate module may not be appropriate.

GeekGlider requested a review from a team as a code owner March 23, 2026 09:23

GeekGlider force-pushed the feature/tiered-storage-module branch from eb42398 to a3c9bdd Compare March 23, 2026 09:25

GeekGlider force-pushed the feature/tiered-storage-module branch from a3c9bdd to 799eb95 Compare March 23, 2026 10:02

GeekGlider force-pushed the feature/tiered-storage-module branch from 799eb95 to a6eb66a Compare March 23, 2026 10:20

GeekGlider force-pushed the feature/tiered-storage-module branch from a6eb66a to b00a2bc Compare March 23, 2026 11:25

GeekGlider force-pushed the feature/tiered-storage-module branch from b00a2bc to 4e87c8d Compare March 23, 2026 12:12

This was referenced Mar 23, 2026

[AUTOCUT] Gradle Check Flaky Test Report for IngestFromKinesisIT #17678

Open

[AUTOCUT] Gradle Check Flaky Test Report for SegmentReplicationWithNodeToNodeIndexShardTests #20860

Open

Add tiered-storage module with stored fields prefetch support

f0959d8

Signed-off-by: Kavya Aggarwal <kavyaagg@amazon.com>

GeekGlider force-pushed the feature/tiered-storage-module branch from 4e87c8d to f0959d8 Compare March 23, 2026 13:05

Conversation

GeekGlider commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Check List

Uh oh!

github-actions bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit f0959d8)

Uh oh!

github-actions bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

kkewwei commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

GeekGlider commented Mar 23, 2026 •

edited

Loading

github-actions bot commented Mar 23, 2026 •

edited

Loading

(Review updated until commit `f0959d8`)

github-actions bot commented Mar 23, 2026 •

edited

Loading